Overview


Dataset statistics

Number of variables: 24
Number of observations: 182242
Missing cells: 736673
Missing cells (%): 16.8%
Duplicate rows: 3051
Duplicate rows (%): 1.7%
Total size in memory: 33.4 MiB
Average record size in memory: 192.0 B
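
The percentages and the memory figure in this block follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and the total size is observations × average record size; the variable names are illustrative only:

```python
# Figures copied from the "Dataset statistics" block.
n_vars = 24
n_obs = 182_242
missing_cells = 736_673
duplicate_rows = 3_051
avg_record_bytes = 192.0

total_cells = n_vars * n_obs                      # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_obs      # share of duplicated rows
total_mib = n_obs * avg_record_bytes / 2**20      # bytes -> MiB

print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"memory: {total_mib:.1f} MiB")
```

The three printed values agree with the reported 16.8%, 1.7% and 33.4 MiB.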

Variable types

Categorical: 4
Numeric: 13
Text: 7

Alerts

Dataset has 3051 (1.7%) duplicate rows [Duplicates]
BDRM_COND is highly overall correlated with GROSS_AREA and 2 other fields [High correlation]
BED_RMS is highly overall correlated with FULL_BTH and 5 other fields [High correlation]
CITY is highly overall correlated with ZIP_CODE [High correlation]
FULL_BTH is highly overall correlated with BED_RMS and 2 other fields [High correlation]
GROSS_AREA is highly overall correlated with BDRM_COND and 5 other fields [High correlation]
KITCHENS is highly overall correlated with BED_RMS and 2 other fields [High correlation]
LIVING_AREA is highly overall correlated with BDRM_COND and 4 other fields [High correlation]
NUM_PARKING is highly overall correlated with BDRM_COND and 1 other field [High correlation]
RES_FLOOR is highly overall correlated with BED_RMS and 3 other fields [High correlation]
STRUCTURE_CLASS is highly overall correlated with YR_BUILT [High correlation]
TT_RMS is highly overall correlated with BED_RMS and 5 other fields [High correlation]
YR_BUILT is highly overall correlated with STRUCTURE_CLASS [High correlation]
ZIP_CODE is highly overall correlated with CITY [High correlation]
NUM_BLDGS is highly imbalanced (99.9%) [Imbalance]
BDRM_COND is highly imbalanced (61.9%) [Imbalance]
BLDG_TYPE has 2616 (1.4%) missing values [Missing]
RES_FLOOR has 33792 (18.5%) missing values [Missing]
LAND_SF has 8002 (4.4%) missing values [Missing]
GROSS_AREA has 33848 (18.6%) missing values [Missing]
LIVING_AREA has 34141 (18.7%) missing values [Missing]
YR_BUILT has 22786 (12.5%) missing values [Missing]
YR_REMODEL has 95524 (52.4%) missing values [Missing]
STRUCTURE_CLASS has 164836 (90.4%) missing values [Missing]
BED_RMS has 48765 (26.8%) missing values [Missing]
FULL_BTH has 11644 (6.4%) missing values [Missing]
HLF_BTH has 11509 (6.3%) missing values [Missing]
KITCHENS has 11718 (6.4%) missing values [Missing]
TT_RMS has 48829 (26.8%) missing values [Missing]
BDRM_COND has 110500 (60.6%) missing values [Missing]
FIREPLACES has 49534 (27.2%) missing values [Missing]
NUM_PARKING has 48623 (26.7%) missing values [Missing]
GROSS_AREA is highly skewed (γ1 = 55.89596558) [Skewed]
LIVING_AREA is highly skewed (γ1 = 63.65241616) [Skewed]
YR_BUILT is highly skewed (γ1 = 146.1067555) [Skewed]
YR_REMODEL is highly skewed (γ1 = 248.0727809) [Skewed]
NUM_PARKING is highly skewed (γ1 = 29.67993455) [Skewed]
BED_RMS has 3184 (1.7%) zeros [Zeros]
FULL_BTH has 36940 (20.3%) zeros [Zeros]
HLF_BTH has 135661 (74.4%) zeros [Zeros]
KITCHENS has 36863 (20.2%) zeros [Zeros]
FIREPLACES has 96980 (53.2%) zeros [Zeros]
NUM_PARKING has 58524 (32.1%) zeros [Zeros]
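
The γ1 values in the Skewed alerts are skewness coefficients. Values in the hundreds for year columns are more likely a data-entry outlier than a genuinely long-tailed distribution. The sketch below uses the plain population formula g1 = m3 / m2^1.5 on an invented sample; the profiler's exact estimator (for example, a bias-corrected variant) is an assumption and may differ slightly. The only figure taken from this report is the YR_BUILT maximum of 20198.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical sample of construction years, invented for illustration.
years = [1890, 1900, 1905, 1910, 1920, 1930, 1965, 1999, 2017] * 100
with_typo = years + [20198]  # one extra row holding the reported YR_BUILT maximum

print(round(skewness(years), 2))      # modest skew
print(round(skewness(with_typo), 2))  # dominated by the single outlier
```

A single implausible value is enough to push the coefficient from below 1 to well above 20, which is why the year columns are worth cleaning before the skewness is interpreted.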

Reproduction

Analysis started: 2024-11-05 22:53:06.970948
Analysis finished: 2024-11-05 22:53:46.170117
Duration: 39.2 seconds
Software version: ydata-profiling v4.12.0
Download configuration: config.json

Variables

CITY
Categorical

High correlation 

Distinct: 19
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 1.4 MiB

BOSTON             47713
DORCHESTER         29328
SOUTH BOSTON       15622
JAMAICA PLAIN      12147
BRIGHTON           12113
Other values (14)  65316

Length

Max length: 16
Median length: 12
Mean length: 9.206937
Min length: 6

Characters and Unicode

Total characters: 1677863
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: EAST BOSTON
2nd row: EAST BOSTON
3rd row: EAST BOSTON
4th row: EAST BOSTON
5th row: EAST BOSTON

Common Values

Value             Count  Frequency (%)
BOSTON            47713  26.2%
DORCHESTER        29328  16.1%
SOUTH BOSTON      15622  8.6%
JAMAICA PLAIN     12147  6.7%
BRIGHTON          12113  6.6%
WEST ROXBURY      11006  6.0%
EAST BOSTON       10233  5.6%
ROSLINDALE        9279   5.1%
HYDE PARK         9192   5.0%
CHARLESTOWN       7252   4.0%
Other values (9)  18354  10.1%

Length

[Figure: histogram of lengths of the category]

Value              Count  Frequency (%)
boston             73568  30.2%
dorchester         29328  12.1%
roxbury            18991  7.8%
south              15622  6.4%
plain              12147  5.0%
jamaica            12147  5.0%
brighton           12113  5.0%
west               11006  4.5%
east               10233  4.2%
roslindale         9279   3.8%
Other values (12)  38868  16.0%

Most occurring characters

Value              Count   Frequency (%)
O                  246073  14.7%
T                  175338  10.5%
S                  165454  9.9%
R                  136346  8.1%
N                  126567  7.5%
E                  106670  6.4%
B                  104696  6.2%
A                  103595  6.2%
H                  75547   4.5%
(space)            61063   3.6%
Other values (14)  376514  22.4%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1677863  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1677863  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1677863  100.0%

ZIP_CODE
Real number (ℝ)

High correlation 

Distinct: 37
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 2129.8679
Minimum: 2026
Maximum: 2467
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 2026
5-th percentile: 2111
Q1: 2119
Median: 2127
Q3: 2131
95-th percentile: 2136
Maximum: 2467
Range: 441
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 30.721915
Coefficient of variation (CV): 0.014424328
Kurtosis: 81.218517
Mean: 2129.8679
Median Absolute Deviation (MAD): 5
Skewness: 8.1238643
Sum: 3.88145 × 10^8
Variance: 943.83603
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=37)]

Value              Count  Frequency (%)
2127               15656  8.6%
2130               12154  6.7%
2135               12114  6.6%
2124               11124  6.1%
2132               11004  6.0%
2128               10231  5.6%
2116               9649   5.3%
2118               9384   5.1%
2131               9283   5.1%
2136               9192   5.0%
Other values (27)  72448  39.8%

Value  Count  Frequency (%)
2026   6      < 0.1%
2108   2172   1.2%
2109   1847   1.0%
2110   2487   1.4%
2111   2893   1.6%
2113   2357   1.3%
2114   5352   2.9%
2115   5548   3.0%
2116   9649   5.3%
2118   9384   5.1%

Value  Count  Frequency (%)
2467   1017   0.6%
2458   1      < 0.1%
2446   11     < 0.1%
2445   13     < 0.1%
2219   1      < 0.1%
2215   3649   2.0%
2210   2132   1.2%
2201   3      < 0.1%
2199   36     < 0.1%
2137   2      < 0.1%
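
ZIP_CODE was profiled as a real number, so a value such as 2127 has lost the leading zero of the five-digit ZIP code 02127, and statistics like the mean or skewness carry no real meaning for it. A minimal sketch of restoring the code as text before further analysis; `format_zip` is a hypothetical helper, not part of the profiling tool:

```python
def format_zip(value):
    """Render a numeric ZIP code as a zero-padded 5-character string."""
    if value is None:
        return None  # keep missing values missing
    return f"{int(value):05d}"

print(format_zip(2127.0))  # 02127
print(format_zip(2467))    # 02467
```

Stored this way, the column would be treated as categorical rather than numeric.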

NUM_BLDGS
Categorical

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

1  182228
2  14

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 182242
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count   Frequency (%)
1      182228  > 99.9%
2      14      < 0.1%
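
The 99.9% imbalance reported for NUM_BLDGS can be reproduced from these two counts with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy. That this is the formula behind the alert is an assumption; the sketch only shows that it is consistent with the reported figure. `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    if len(counts) < 2:
        return 0.0  # assumption: a single class is treated as not imbalanced
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

# NUM_BLDGS counts from the Common Values table.
score = imbalance_score([182228, 14])
print(f"{score:.1%}")  # 99.9%
```

A perfectly balanced variable scores 0 and a variable dominated by one class approaches 1.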

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value  Count   Frequency (%)
1      182228  > 99.9%
2      14      < 0.1%

Most occurring characters

Value  Count   Frequency (%)
1      182228  > 99.9%
2      14      < 0.1%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  182242  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  182242  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  182242  100.0%

(variable name missing in source)
Text

Distinct: 194
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 28
Median length: 26
Mean length: 16.888231
Min length: 5

Characters and Unicode

Total characters: 3077745
Distinct characters: 68
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16
Unique (%): < 0.1%

Sample

1st row: THREE-FAM DWELLING
2nd row: THREE-FAM DWELLING
3rd row: THREE-FAM DWELLING
4th row: THREE-FAM DWELLING
5th row: TWO-FAM DWELLING

Value               Count  Frequency (%)
condo               92825  21.9%
residential         72979  17.2%
dwelling            60551  14.3%
single              30439  7.2%
fam                 30439  7.2%
two-fam             16814  4.0%
res                 15805  3.7%
three-fam           13298  3.1%
main                10768  2.5%
parking             8987   2.1%
Other values (264)  71041  16.8%

Most occurring characters

Value              Count   Frequency (%)
E                  306547  10.0%
N                  293643  9.5%
I                  277658  9.0%
L                  247639  8.0%
(space)            241900  7.9%
D                  239004  7.8%
O                  225960  7.3%
A                  176683  5.7%
S                  140752  4.6%
T                  130272  4.2%
Other values (58)  797687  25.9%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  3077745  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  3077745  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  3077745  100.0%

BLDG_TYPE
Text

Missing 

Distinct: 201
Distinct (%): 0.1%
Missing: 2616
Missing (%): 1.4%
Memory size: 1.4 MiB

Length

Max length: 34
Median length: 31
Mean length: 13.795586
Min length: 1

Characters and Unicode

Total characters: 2478046
Distinct characters: 69
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12
Unique (%): < 0.1%

Sample

1st row: RE - Row End
2nd row: RM - Row Middle
3rd row: RM - Row Middle
4th row: RM - Row Middle
5th row: RE - Row End

Value               Count   Frequency (%)
(empty)             168505  26.6%
rise                38625   6.1%
row                 26473   4.2%
rm                  17763   2.8%
middle              17698   2.8%
cl                  16789   2.6%
colonial            16789   2.6%
lr                  15448   2.4%
low                 15448   2.4%
mr                  15194   2.4%
Other values (462)  285650  45.0%

Most occurring characters

Value              Count   Frequency (%)
(space)            454951  18.4%
-                  179927  7.3%
R                  147705  6.0%
e                  145112  5.9%
i                  129604  5.2%
o                  126568  5.1%
n                  106689  4.3%
d                  85062   3.4%
a                  82052   3.3%
l                  79352   3.2%
Other values (59)  941024  38.0%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  2478046  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  2478046  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  2478046  100.0%

RES_FLOOR
Real number (ℝ)

High correlation  Missing 

Distinct: 48
Distinct (%): < 0.1%
Missing: 33792
Missing (%): 18.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8803537
Minimum: 0
Maximum: 62
Zeros: 31
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 2
Q3: 2.5
95-th percentile: 3
Maximum: 62
Range: 62
Interquartile range (IQR): 1.5

Descriptive statistics

Standard deviation: 1.1290417
Coefficient of variation (CV): 0.60044113
Kurtosis: 321.52964
Mean: 1.8803537
Median Absolute Deviation (MAD): 1
Skewness: 9.6638775
Sum: 279138.5
Variance: 1.2747351
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=48)]

Value              Count  Frequency (%)
1                  63070  34.6%
2                  42412  23.3%
3                  24604  13.5%
2.5                7712   4.2%
4                  4616   2.5%
1.5                3505   1.9%
5                  1360   0.7%
3.5                487    0.3%
6                  237    0.1%
4.5                105    0.1%
Other values (38)  342    0.2%
(Missing)          33792  18.5%

Value  Count  Frequency (%)
0      31     < 0.1%
1      63070  34.6%
1.5    3505   1.9%
2      42412  23.3%
2.5    7712   4.2%
3      24604  13.5%
3.5    487    0.3%
4      4616   2.5%
4.5    105    0.1%
5      1360   0.7%

Value  Count  Frequency (%)
62     1      < 0.1%
60     2      < 0.1%
46     3      < 0.1%
45     1      < 0.1%
41     2      < 0.1%
40     1      < 0.1%
39     2      < 0.1%
36     1      < 0.1%
35     1      < 0.1%
33     1      < 0.1%

LAND_SF
Text

Missing 

Distinct: 17559
Distinct (%): 10.1%
Missing: 8002
Missing (%): 4.4%
Memory size: 1.4 MiB

Length

Max length: 11
Median length: 5
Mean length: 4.5646235
Min length: 3

Characters and Unicode

Total characters: 795340
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7616
Unique (%): 4.4%

Sample

1st row: 1,150
2nd row: 1,150
3rd row: 1,150
4th row: 1,150
5th row: 2,010

Value                 Count   Frequency (%)
5,000                 2392    1.4%
4,000                 1353    0.8%
2,500                 862     0.5%
6,000                 858     0.5%
4,500                 671     0.4%
5,500                 641     0.4%
3,600                 536     0.3%
3,200                 492     0.3%
3,000                 490     0.3%
2,000                 347     0.2%
Other values (17549)  165598  95.0%

Most occurring characters

Value  Count   Frequency (%)
,      130489  16.4%
0      106022  13.3%
1      93910   11.8%
5      73860   9.3%
2      66439   8.4%
4      60552   7.6%
3      59000   7.4%
6      57057   7.2%
7      52775   6.6%
8      50775   6.4%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  795340  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  795340  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  795340  100.0%
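
LAND_SF is typed as Text because its values carry thousands separators ("1,150"); the only 11 distinct characters are the ten digits and the comma. A minimal sketch of converting such strings to integers so the column can be profiled numerically; `parse_grouped_int` is a hypothetical helper and the treatment of blank strings as missing is an assumption:

```python
def parse_grouped_int(text):
    """Convert a string such as '1,150' to the integer 1150."""
    if text is None or not text.strip():
        return None  # assumption: blank entries are treated as missing
    return int(text.replace(",", ""))

samples = ["1,150", "2,010", "5,000"]  # values seen in the LAND_SF tables
print([parse_grouped_int(s) for s in samples])  # [1150, 2010, 5000]
```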

GROSS_AREA
Real number (ℝ)

High correlation  Missing  Skewed 

Distinct: 13098
Distinct (%): 8.8%
Missing: 33848
Missing (%): 18.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 5434.6768
Minimum: 3
Maximum: 6982322
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 3
5-th percentile: 544
Q1: 967
Median: 2085
Q3: 4008
95-th percentile: 7770.7
Maximum: 6982322
Range: 6982319
Interquartile range (IQR): 3041

Descriptive statistics

Standard deviation: 41322.818
Coefficient of variation (CV): 7.6035465
Kurtosis: 6477.9546
Mean: 5434.6768
Median Absolute Deviation (MAD): 1271
Skewness: 55.895966
Sum: 8.0647342 × 10^8
Variance: 1.7075753 × 10^9
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value                 Count   Frequency (%)
780                   282     0.2%
600                   237     0.1%
700                   230     0.1%
625                   220     0.1%
690                   214     0.1%
800                   213     0.1%
1050                  211     0.1%
760                   210     0.1%
775                   203     0.1%
730                   197     0.1%
Other values (13088)  146177  80.2%
(Missing)             33848   18.6%

Value  Count  Frequency (%)
3      1      < 0.1%
4      1      < 0.1%
25     1      < 0.1%
42     1      < 0.1%
60     1      < 0.1%
82     1      < 0.1%
90     1      < 0.1%
100    125    0.1%
102    1      < 0.1%
106    1      < 0.1%

Value    Count  Frequency (%)
6982322  1      < 0.1%
3064910  1      < 0.1%
2948448  1      < 0.1%
2481232  1      < 0.1%
2310322  1      < 0.1%
1976650  1      < 0.1%
1970176  1      < 0.1%
1933059  1      < 0.1%
1772572  1      < 0.1%
1726152  1      < 0.1%
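
Several of the GROSS_AREA descriptive statistics are simple functions of the others, which gives a quick consistency check on the report. The sketch below recomputes them from the reported figures; the variable names are illustrative:

```python
# Reported GROSS_AREA figures.
minimum, maximum = 3, 6_982_322
q1, median, q3 = 967, 2085, 4008
mean = 5434.6768
std = 41322.818

value_range = maximum - minimum   # reported: 6982319
iqr = q3 - q1                     # reported: 3041
cv = std / mean                   # reported: 7.6035465
variance = std ** 2               # reported: 1.7075753 x 10^9

print(value_range, iqr, round(cv, 4), f"{variance:.4e}")
print(mean / median)  # mean far above the median: a long right tail
```

The mean is more than twice the median and the standard deviation is over seven times the mean, consistent with a handful of very large values (the maximum is close to 7 million) dominating the moments.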

LIVING_AREA
Real number (ℝ)

High correlation  Missing  Skewed 

Distinct: 21808
Distinct (%): 14.7%
Missing: 34141
Missing (%): 18.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 4437.7612
Minimum: 2
Maximum: 6982322
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 2
5-th percentile: 542
Q1: 942
Median: 1483.5
Q3: 2600
95-th percentile: 5768
Maximum: 6982322
Range: 6982320
Interquartile range (IQR): 1658

Descriptive statistics

Standard deviation: 38453.214
Coefficient of variation (CV): 8.665003
Kurtosis: 8373.4206
Mean: 4437.7612
Median Absolute Deviation (MAD): 689.5
Skewness: 63.652416
Sum: 6.5723688 × 10^8
Variance: 1.4786497 × 10^9
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value                 Count   Frequency (%)
780                   291     0.2%
800                   263     0.1%
1008                  261     0.1%
1050                  250     0.1%
1224                  243     0.1%
600                   239     0.1%
700                   235     0.1%
625                   223     0.1%
1000                  222     0.1%
960                   221     0.1%
Other values (21798)  145653  79.9%
(Missing)             34141   18.7%

Value  Count  Frequency (%)
2      1      < 0.1%
25     1      < 0.1%
42     1      < 0.1%
82     1      < 0.1%
90     1      < 0.1%
100    122    0.1%
102    1      < 0.1%
106    1      < 0.1%
108    1      < 0.1%
112    4      < 0.1%

Value    Count  Frequency (%)
6982322  1      < 0.1%
2898078  1      < 0.1%
2882794  1      < 0.1%
2413114  1      < 0.1%
2310322  1      < 0.1%
1940476  1      < 0.1%
1885420  1      < 0.1%
1694084  1      < 0.1%
1595056  1      < 0.1%
1504200  1      < 0.1%

(variable name missing in source)
Text

Distinct: 16658
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 11
Median length: 1
Mean length: 3.909697
Min length: 1

Characters and Unicode

Total characters: 712511
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9985
Unique (%): 5.5%

Sample

1st row: 197,600
2nd row: 198,500
3rd row: 199,100
4th row: 199,700
5th row: 230,200

Value                 Count  Frequency (%)
0                     94289  51.7%
218,300               66     < 0.1%
203,200               63     < 0.1%
238,200               59     < 0.1%
250,000               58     < 0.1%
229,900               57     < 0.1%
218,500               57     < 0.1%
240,900               57     < 0.1%
239,800               57     < 0.1%
233,500               57     < 0.1%
Other values (16648)  87422  48.0%

Most occurring characters

Value  Count   Frequency (%)
0      288645  40.5%
,      94143   13.2%
2      63438   8.9%
1      51469   7.2%
3      39788   5.6%
4      32613   4.6%
5      30068   4.2%
6      28788   4.0%
7      28041   3.9%
8      27791   3.9%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  712511  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  712511  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  712511  100.0%

(variable name missing in source)
Text

Distinct: 28352
Distinct (%): 15.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 13
Median length: 7
Mean length: 6.5419278
Min length: 1

Characters and Unicode

Total characters: 1192214
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14035
Unique (%): 7.7%

Sample

1st row: 594,400
2nd row: 619,700
3rd row: 605,300
4th row: 535,600
5th row: 501,400

Value                 Count   Frequency (%)
0                     20018   11.0%
200                   2213    1.2%
60,000                674     0.4%
40,000                565     0.3%
43,000                481     0.3%
90,000                342     0.2%
38,000                308     0.2%
74,600                305     0.2%
48,000                280     0.2%
108,000               275     0.2%
Other values (28342)  156781  86.0%

Most occurring characters

Value  Count   Frequency (%)
0      396479  33.3%
,      187305  15.7%
4      77028   6.5%
3      74480   6.2%
1      74375   6.2%
5      71824   6.0%
6      67827   5.7%
2      67112   5.6%
7      62524   5.2%
8      58227   4.9%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1192214  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1192214  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1192214  100.0%

(variable name missing in source)
Text

Distinct: 32201
Distinct (%): 17.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 13
Median length: 7
Mean length: 7.0112323
Min length: 1

Characters and Unicode

Total characters: 1277741
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15263
Unique (%): 8.4%

Sample

1st row: 792,000
2nd row: 818,200
3rd row: 804,400
4th row: 735,300
5th row: 731,600

Value                 Count   Frequency (%)
0                     10774   5.9%
60,000                681     0.4%
40,000                580     0.3%
43,000                491     0.3%
90,000                345     0.2%
38,000                318     0.2%
74,600                307     0.2%
48,000                293     0.2%
108,000               277     0.2%
47,000                272     0.1%
Other values (32191)  167904  92.1%

Most occurring characters

Value  Count   Frequency (%)
0      413329  32.3%
,      212603  16.6%
1      90965   7.1%
4      76178   6.0%
5      74929   5.9%
6      73504   5.8%
3      71845   5.6%
7      69909   5.5%
2      67851   5.3%
8      64956   5.1%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1277741  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1277741  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1277741  100.0%

(variable name missing in source)
Text

Distinct: 34946
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Length

Max length: 15
Median length: 10
Mean length: 9.7160369
Min length: 6

Characters and Unicode

Total characters: 1770670
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18669
Unique (%): 10.2%

Sample

1st row: $8,632.80
2nd row: $8,918.38
3rd row: $8,767.96
4th row: $8,014.77
5th row: $7,974.44

Value                 Count   Frequency (%)
(empty)               18617   10.2%
654.00                676     0.4%
436.00                577     0.3%
468.70                441     0.2%
981.00                343     0.2%
813.14                305     0.2%
523.20                290     0.2%
414.20                286     0.2%
1,177.20              276     0.2%
512.30                268     0.1%
Other values (34936)  160163  87.9%

Most occurring characters

Value             Count   Frequency (%)
(space)           238093  13.4%
$                 182242  10.3%
.                 163625  9.2%
,                 150824  8.5%
1                 127167  7.2%
6                 103329  5.8%
4                 102418  5.8%
5                 101394  5.7%
7                 99792   5.6%
3                 99129   5.6%
Other values (5)  402657  22.7%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1770670  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1770670  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1770670  100.0%
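
The dollar amounts in this variable ("$8,632.80") are stored as text, so they cannot be summed or summarised as numbers. A minimal sketch of parsing them into exact decimal values, using the five sample rows; `parse_dollars` is a hypothetical helper, and treating blank or dash-only entries as missing is an assumption about how empty amounts are written:

```python
from decimal import Decimal

def parse_dollars(text):
    """Convert a string such as '$8,632.80' to Decimal('8632.80')."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    if not cleaned or cleaned == "-":
        return None  # assumption: blank or dash-only entries mean "no amount"
    return Decimal(cleaned)

sample_rows = ["$8,632.80", "$8,918.38", "$8,767.96", "$8,014.77", "$7,974.44"]
total = sum(parse_dollars(s) for s in sample_rows)
print(total)  # 42308.35
```

`Decimal` is used instead of `float` so that cents are preserved exactly when amounts are added.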

YR_BUILT
Real number (ℝ)

High correlation  Missing  Skewed 

Distinct: 236
Distinct (%): 0.1%
Missing: 22786
Missing (%): 12.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1933.2157
Minimum: 1700
Maximum: 20198
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1700
5-th percentile: 1880
Q1: 1900
Median: 1920
Q3: 1965
95-th percentile: 2017
Maximum: 20198
Range: 18498
Interquartile range (IQR): 65

Descriptive statistics

Standard deviation: 63.981908
Coefficient of variation (CV): 0.033096104
Kurtosis: 41646.839
Mean: 1933.2157
Median Absolute Deviation (MAD): 21
Skewness: 146.10676
Sum: 3.0826284 × 10^8
Variance: 4093.6845
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value               Count  Frequency (%)
1900                17988  9.9%
1920                12195  6.7%
1910                11790  6.5%
1905                10567  5.8%
1899                9866   5.4%
1890                8985   4.9%
1930                4176   2.3%
1999                3820   2.1%
1925                3785   2.1%
1880                3320   1.8%
Other values (226)  72964  40.0%
(Missing)           22786  12.5%

Value  Count  Frequency (%)
1700   1      < 0.1%
1710   1      < 0.1%
1725   2      < 0.1%
1752   2      < 0.1%
1760   1      < 0.1%
1775   1      < 0.1%
1779   1      < 0.1%
1780   1      < 0.1%
1785   2      < 0.1%
1789   1      < 0.1%

Value  Count  Frequency (%)
20198  1      < 0.1%
2023   8      < 0.1%
2022   340    0.2%
2021   1224   0.7%
2020   1613   0.9%
2019   1158   0.6%
2018   2437   1.3%
2017   2185   1.2%
2016   1603   0.9%
2015   1392   0.8%

YR_REMODEL
Real number (ℝ)

Missing  Skewed 

Distinct: 106
Distinct (%): 0.1%
Missing: 95524
Missing (%): 52.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2002.0116
Minimum: 0
Maximum: 20220
Zeros: 2
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1979
Q1: 1987
Median: 2005
Q3: 2016
95-th percentile: 2021
Maximum: 20220
Range: 20220
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 65.386476
Coefficient of variation (CV): 0.032660387
Kurtosis: 69535.907
Mean: 2002.0116
Median Absolute Deviation (MAD): 12
Skewness: 248.07278
Sum: 1.7361045 × 10^8
Variance: 4275.3912
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Value              Count  Frequency (%)
1985               3962   2.2%
2017               3910   2.1%
2021               3361   1.8%
2005               3207   1.8%
2019               3124   1.7%
1980               3112   1.7%
2018               3029   1.7%
2016               2893   1.6%
2022               2718   1.5%
2004               2711   1.5%
Other values (96)  54691  30.0%
(Missing)          95524  52.4%

Value  Count  Frequency (%)
0      2      < 0.1%
201    2      < 0.1%
221    1      < 0.1%
1900   9      < 0.1%
1902   1      < 0.1%
1904   1      < 0.1%
1910   1      < 0.1%
1914   3      < 0.1%
1915   1      < 0.1%
1916   2      < 0.1%

Value  Count  Frequency (%)
20220  1      < 0.1%
2921   1      < 0.1%
2121   1      < 0.1%
2023   118    0.1%
2022   2718   1.5%
2021   3361   1.8%
2020   2624   1.4%
2019   3124   1.7%
2018   3029   1.7%
2017   3910   2.1%
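
The smallest and largest YR_REMODEL values (0, 201, 221, 2121, 2921, 20220) cannot be real remodel years and look like entry errors. A minimal sketch of flagging them with a plausibility range; the bounds (1600 and 2024, the year this analysis ran) and the helper name are assumptions chosen for illustration:

```python
def is_plausible_year(year, earliest=1600, latest=2024):
    """True if `year` lies in a believable range; both bounds are assumptions."""
    return earliest <= year <= latest

# Extreme YR_REMODEL values taken from the smallest/largest value tables.
extremes = [0, 201, 221, 1900, 2023, 2121, 2921, 20220]
suspect = [y for y in extremes if not is_plausible_year(y)]
print(suspect)  # [0, 201, 221, 2121, 2921, 20220]
```

Only a handful of rows are affected, but they account for the range of 20220 and the skewness of about 248 reported for this variable.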

STRUCTURE_CLASS
Categorical

High correlation  Missing 

Distinct: 5
Distinct (%): < 0.1%
Missing: 164836
Missing (%): 90.4%
Memory size: 1.4 MiB

C - Brick/Concr   10201
D - Wood/Frame    4528
B - Reinf Concr   1649
A - Struct Steel  925
E - Metal         103

Length

Max length: 16
Median length: 15
Mean length: 14.757497
Min length: 9

Characters and Unicode

Total characters: 256869
Distinct characters: 27
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: D - Wood/Frame
2nd row: D - Wood/Frame
3rd row: D - Wood/Frame
4th row: D - Wood/Frame
5th row: D - Wood/Frame

Common Values

Value             Count   Frequency (%)
C - Brick/Concr   10201   5.6%
D - Wood/Frame    4528    2.5%
B - Reinf Concr   1649    0.9%
A - Struct Steel  925     0.5%
E - Metal         103     0.1%
(Missing)         164836  90.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value             Count  Frequency (%)
(empty)           17406  31.8%
c                 10201  18.6%
brick/concr       10201  18.6%
d                 4528   8.3%
wood/frame        4528   8.3%
b                 1649   3.0%
reinf             1649   3.0%
concr             1649   3.0%
a                 925    1.7%
struct            925    1.7%
Other values (3)  1131   2.1%

Most occurring characters

Value              Count  Frequency (%)
(space)            37386  14.6%
r                  27504  10.7%
c                  22976  8.9%
C                  22051  8.6%
o                  20906  8.1%
-                  17406  6.8%
/                  14729  5.7%
n                  13499  5.3%
B                  11850  4.6%
i                  11850  4.6%
Other values (17)  56712  22.1%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  256869  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  256869  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  256869  100.0%

BED_RMS
Real number (ℝ)

High correlation  Missing  Zeros 

Distinct: 19
Distinct (%): < 0.1%
Missing: 48765
Missing (%): 26.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.1484376
Minimum: 0
Maximum: 21
Zeros: 3184
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 8
Maximum: 21
Range: 21
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.1022194
Coefficient of variation (CV): 0.66770242
Kurtosis: 2.2563752
Mean: 3.1484376
Median Absolute Deviation (MAD): 1
Skewness: 1.3651924
Sum: 420244
Variance: 4.4193264
Monotonicity: Not monotonic
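
Several of the BED_RMS figures above are tied together by simple identities: the mean is the sum divided by the number of non-missing values, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below only re-derives those numbers from the values printed in the report (the observation count comes from the dataset overview); the variable names are illustrative.

```python
import math

# Values copied from the report.
n_rows = 182242        # number of observations (dataset overview)
n_missing = 48765      # missing BED_RMS values
total = 420244         # "Sum"
mean = 3.1484376
std = 2.1022194

n_valid = n_rows - n_missing       # rows that actually carry a value
mean_from_sum = total / n_valid    # should reproduce the reported mean
cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation

print(n_valid, round(mean_from_sum, 6), round(cv, 6), round(variance, 6))
print(math.isclose(cv, 0.66770242, abs_tol=1e-6))
```

The same checks apply to the other numeric variables in the report.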
Histogram with fixed size bins (bins=19)
Value | Count | Frequency (%)
2 | 38002 | 20.9%
3 | 27552 | 15.1%
1 | 22279 | 12.2%
4 | 15328 | 8.4%
6 | 9957 | 5.5%
5 | 8124 | 4.5%
9 | 3275 | 1.8%
0 | 3184 | 1.7%
8 | 2297 | 1.3%
7 | 2208 | 1.2%
Other values (9) | 1271 | 0.7%
(Missing) | 48765 | 26.8%

Value | Count | Frequency (%)
0 | 3184 | 1.7%
1 | 22279 | 12.2%
2 | 38002 | 20.9%
3 | 27552 | 15.1%
4 | 15328 | 8.4%
5 | 8124 | 4.5%
6 | 9957 | 5.5%
7 | 2208 | 1.2%
8 | 2297 | 1.3%
9 | 3275 | 1.8%

Value | Count | Frequency (%)
21 | 1 | < 0.1%
17 | 5 | < 0.1%
16 | 2 | < 0.1%
15 | 27 | < 0.1%
14 | 54 | < 0.1%
13 | 32 | < 0.1%
12 | 351 | 0.2%
11 | 407 | 0.2%
10 | 392 | 0.2%
9 | 3275 | 1.8%

FULL_BTH
Real number (ℝ)

High correlation  Missing  Zeros 

Distinct: 17
Distinct (%): < 0.1%
Missing: 11644
Missing (%): 6.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.359758
Minimum: 0
Maximum: 21
Zeros: 36940
Zeros (%): 20.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 2
95-th percentile: 3
Maximum: 21
Range: 21
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.0605678
Coefficient of variation (CV): 0.77996806
Kurtosis: 3.1724521
Mean: 1.359758
Median Absolute Deviation (MAD): 1
Skewness: 0.921087
Sum: 231972
Variance: 1.1248041
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Value | Count | Frequency (%)
1 | 64877 | 35.6%
2 | 45213 | 24.8%
0 | 36940 | 20.3%
3 | 19794 | 10.9%
4 | 2551 | 1.4%
6 | 580 | 0.3%
5 | 519 | 0.3%
7 | 65 | < 0.1%
8 | 35 | < 0.1%
9 | 12 | < 0.1%
Other values (7) | 12 | < 0.1%
(Missing) | 11644 | 6.4%

Value | Count | Frequency (%)
0 | 36940 | 20.3%
1 | 64877 | 35.6%
2 | 45213 | 24.8%
3 | 19794 | 10.9%
4 | 2551 | 1.4%
5 | 519 | 0.3%
6 | 580 | 0.3%
7 | 65 | < 0.1%
8 | 35 | < 0.1%
9 | 12 | < 0.1%

Value | Count | Frequency (%)
21 | 1 | < 0.1%
17 | 1 | < 0.1%
15 | 2 | < 0.1%
14 | 1 | < 0.1%
13 | 3 | < 0.1%
12 | 2 | < 0.1%
10 | 2 | < 0.1%
9 | 12 | < 0.1%
8 | 35 | < 0.1%
7 | 65 | < 0.1%

HLF_BTH
Real number (ℝ)

Missing  Zeros 

Distinct: 8
Distinct (%): < 0.1%
Missing: 11509
Missing (%): 6.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.22192546
Minimum: 0
Maximum: 7
Zeros: 135661
Zeros (%): 74.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.46002771
Coefficient of variation (CV): 2.0728929
Kurtosis: 5.6723827
Mean: 0.22192546
Median Absolute Deviation (MAD): 0
Skewness: 2.1362235
Sum: 37890
Variance: 0.2116255
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value | Count | Frequency (%)
0 | 135661 | 74.4%
1 | 32695 | 17.9%
2 | 1983 | 1.1%
3 | 361 | 0.2%
4 | 23 | < 0.1%
5 | 7 | < 0.1%
6 | 2 | < 0.1%
7 | 1 | < 0.1%
(Missing) | 11509 | 6.3%

Value | Count | Frequency (%)
0 | 135661 | 74.4%
1 | 32695 | 17.9%
2 | 1983 | 1.1%
3 | 361 | 0.2%
4 | 23 | < 0.1%
5 | 7 | < 0.1%
6 | 2 | < 0.1%
7 | 1 | < 0.1%

Value | Count | Frequency (%)
7 | 1 | < 0.1%
6 | 2 | < 0.1%
5 | 7 | < 0.1%
4 | 23 | < 0.1%
3 | 361 | 0.2%
2 | 1983 | 1.1%
1 | 32695 | 17.9%
0 | 135661 | 74.4%

KITCHENS
Real number (ℝ)

High correlation  Missing  Zeros 

Distinct: 6
Distinct (%): < 0.1%
Missing: 11718
Missing (%): 6.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0518813
Minimum: 0
Maximum: 5
Zeros: 36863
Zeros (%): 20.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.8053745
Coefficient of variation (CV): 0.76565153
Kurtosis: 0.7498689
Mean: 1.0518813
Median Absolute Deviation (MAD): 0
Skewness: 0.8745803
Sum: 179371
Variance: 0.64862808
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
1 | 102061 | 56.0%
0 | 36863 | 20.2%
2 | 17625 | 9.7%
3 | 13841 | 7.6%
4 | 133 | 0.1%
5 | 1 | < 0.1%
(Missing) | 11718 | 6.4%

Value | Count | Frequency (%)
0 | 36863 | 20.2%
1 | 102061 | 56.0%
2 | 17625 | 9.7%
3 | 13841 | 7.6%
4 | 133 | 0.1%
5 | 1 | < 0.1%

Value | Count | Frequency (%)
5 | 1 | < 0.1%
4 | 133 | 0.1%
3 | 13841 | 7.6%
2 | 17625 | 9.7%
1 | 102061 | 56.0%
0 | 36863 | 20.2%

TT_RMS
Real number (ℝ)

High correlation  Missing 

Distinct: 20
Distinct (%): < 0.1%
Missing: 48829
Missing (%): 26.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.940583
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 4
Median: 6
Q3: 9
95-th percentile: 15
Maximum: 20
Range: 19
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 4.0097978
Coefficient of variation (CV): 0.57773214
Kurtosis: 0.54651164
Mean: 6.940583
Median Absolute Deviation (MAD): 2
Skewness: 1.1306302
Sum: 925964
Variance: 16.078479
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Value | Count | Frequency (%)
4 | 24165 | 13.3%
5 | 18593 | 10.2%
3 | 15940 | 8.7%
6 | 15592 | 8.6%
7 | 10547 | 5.8%
8 | 7496 | 4.1%
10 | 5715 | 3.1%
12 | 5159 | 2.8%
9 | 4870 | 2.7%
2 | 4729 | 2.6%
Other values (10) | 20607 | 11.3%
(Missing) | 48829 | 26.8%

Value | Count | Frequency (%)
1 | 697 | 0.4%
2 | 4729 | 2.6%
3 | 15940 | 8.7%
4 | 24165 | 13.3%
5 | 18593 | 10.2%
6 | 15592 | 8.6%
7 | 10547 | 5.8%
8 | 7496 | 4.1%
9 | 4870 | 2.7%
10 | 5715 | 3.1%

Value | Count | Frequency (%)
20 | 598 | 0.3%
19 | 183 | 0.1%
18 | 2387 | 1.3%
17 | 1286 | 0.7%
16 | 993 | 0.5%
15 | 4658 | 2.6%
14 | 3195 | 1.8%
13 | 2332 | 1.3%
12 | 5159 | 2.8%
11 | 4278 | 2.3%

BDRM_COND
Categorical

High correlation  Imbalance  Missing 

Distinct: 5
Distinct (%): < 0.1%
Missing: 110500
Missing (%): 60.6%
Memory size: 1.4 MiB

A - Average: 56687
G - Good: 13194
E - Excellent: 962
F - Fair: 837
P - Poor: 62

Length

Max length: 13
Median length: 11
Mean length: 10.437498
Min length: 8

Characters and Unicode

Total characters: 748807
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A - Average
2nd row: A - Average
3rd row: A - Average
4th row: A - Average
5th row: A - Average

Common Values

Value | Count | Frequency (%)
A - Average | 56687 | 31.1%
G - Good | 13194 | 7.2%
E - Excellent | 962 | 0.5%
F - Fair | 837 | 0.5%
P - Poor | 62 | < 0.1%
(Missing) | 110500 | 60.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(empty) | 71742 | 33.3%
a | 56687 | 26.3%
average | 56687 | 26.3%
g | 13194 | 6.1%
good | 13194 | 6.1%
e | 962 | 0.4%
excellent | 962 | 0.4%
f | 837 | 0.4%
fair | 837 | 0.4%
p | 62 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 143484 | 19.2%
e | 115298 | 15.4%
A | 113374 | 15.1%
- | 71742 | 9.6%
r | 57586 | 7.7%
a | 57524 | 7.7%
v | 56687 | 7.6%
g | 56687 | 7.6%
o | 26512 | 3.5%
G | 26388 | 3.5%
Other values (10) | 23525 | 3.1%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 748807 | 100.0%

Most frequent character per category

(unknown)
Value | Count | Frequency (%)
(space) | 143484 | 19.2%
e | 115298 | 15.4%
A | 113374 | 15.1%
- | 71742 | 9.6%
r | 57586 | 7.7%
a | 57524 | 7.7%
v | 56687 | 7.6%
g | 56687 | 7.6%
o | 26512 | 3.5%
G | 26388 | 3.5%
Other values (10) | 23525 | 3.1%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 748807 | 100.0%

Most frequent character per script

(unknown)
Value | Count | Frequency (%)
(space) | 143484 | 19.2%
e | 115298 | 15.4%
A | 113374 | 15.1%
- | 71742 | 9.6%
r | 57586 | 7.7%
a | 57524 | 7.7%
v | 56687 | 7.6%
g | 56687 | 7.6%
o | 26512 | 3.5%
G | 26388 | 3.5%
Other values (10) | 23525 | 3.1%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 748807 | 100.0%

Most frequent character per block

(unknown)
Value | Count | Frequency (%)
(space) | 143484 | 19.2%
e | 115298 | 15.4%
A | 113374 | 15.1%
- | 71742 | 9.6%
r | 57586 | 7.7%
a | 57524 | 7.7%
v | 56687 | 7.6%
g | 56687 | 7.6%
o | 26512 | 3.5%
G | 26388 | 3.5%
Other values (10) | 23525 | 3.1%

FIREPLACES
Real number (ℝ)

Missing  Zeros 

Distinct: 13
Distinct (%): < 0.1%
Missing: 49534
Missing (%): 27.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.34416162
Minimum: 0
Maximum: 12
Zeros: 96980
Zeros (%): 53.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 2
Maximum: 12
Range: 12
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.68600397
Coefficient of variation (CV): 1.9932611
Kurtosis: 22.273269
Mean: 0.34416162
Median Absolute Deviation (MAD): 0
Skewness: 3.4407165
Sum: 45673
Variance: 0.47060145
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
0 | 96980 | 53.2%
1 | 28983 | 15.9%
2 | 5085 | 2.8%
3 | 917 | 0.5%
4 | 359 | 0.2%
5 | 167 | 0.1%
6 | 114 | 0.1%
7 | 48 | < 0.1%
8 | 34 | < 0.1%
9 | 12 | < 0.1%
Other values (3) | 9 | < 0.1%
(Missing) | 49534 | 27.2%

Value | Count | Frequency (%)
0 | 96980 | 53.2%
1 | 28983 | 15.9%
2 | 5085 | 2.8%
3 | 917 | 0.5%
4 | 359 | 0.2%
5 | 167 | 0.1%
6 | 114 | 0.1%
7 | 48 | < 0.1%
8 | 34 | < 0.1%
9 | 12 | < 0.1%

Value | Count | Frequency (%)
12 | 2 | < 0.1%
11 | 4 | < 0.1%
10 | 3 | < 0.1%
9 | 12 | < 0.1%
8 | 34 | < 0.1%
7 | 48 | < 0.1%
6 | 114 | 0.1%
5 | 167 | 0.1%
4 | 359 | 0.2%
3 | 917 | 0.5%

NUM_PARKING
Real number (ℝ)

High correlation  Missing  Skewed  Zeros 

Distinct: 23
Distinct (%): < 0.1%
Missing: 48623
Missing (%): 26.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3289427
Minimum: 0
Maximum: 210
Zeros: 58524
Zeros (%): 32.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 210
Range: 210
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.411295
Coefficient of variation (CV): 1.8144461
Kurtosis: 1663.0013
Mean: 1.3289427
Median Absolute Deviation (MAD): 1
Skewness: 29.679935
Sum: 177572
Variance: 5.8143437
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Value | Count | Frequency (%)
0 | 58524 | 32.1%
1 | 29227 | 16.0%
2 | 21830 | 12.0%
3 | 8748 | 4.8%
4 | 8222 | 4.5%
6 | 3011 | 1.7%
5 | 2636 | 1.4%
8 | 631 | 0.3%
7 | 554 | 0.3%
10 | 95 | 0.1%
Other values (13) | 141 | 0.1%
(Missing) | 48623 | 26.7%

Value | Count | Frequency (%)
0 | 58524 | 32.1%
1 | 29227 | 16.0%
2 | 21830 | 12.0%
3 | 8748 | 4.8%
4 | 8222 | 4.5%
5 | 2636 | 1.4%
6 | 3011 | 1.7%
7 | 554 | 0.3%
8 | 631 | 0.3%
9 | 88 | < 0.1%

Value | Count | Frequency (%)
210 | 1 | < 0.1%
125 | 24 | < 0.1%
56 | 1 | < 0.1%
22 | 1 | < 0.1%
20 | 1 | < 0.1%
18 | 1 | < 0.1%
17 | 1 | < 0.1%
16 | 3 | < 0.1%
14 | 5 | < 0.1%
13 | 2 | < 0.1%
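
NUM_PARKING carries the Skewed alert: the median is 1 and the 95-th percentile is 5, yet the maximum is 210. One common way to make such a tail concrete is Tukey's upper fence, Q3 + 1.5 × IQR. This is a sketch for illustration only; the report does not say it uses this rule, and the function name and the multiplier 1.5 are assumptions.

```python
def tukey_upper_fence(q1, q3, k=1.5):
    """Return the upper outlier fence Q3 + k * IQR (k=1.5 is a convention)."""
    return q3 + k * (q3 - q1)


# Quartiles and the ten largest NUM_PARKING values reported above.
q1, q3 = 0, 2
largest_values = [210, 125, 56, 22, 20, 18, 17, 16, 14, 13]

fence = tukey_upper_fence(q1, q3)
flagged = [v for v in largest_values if v > fence]
print("upper fence:", fence)
print("values above the fence:", flagged)
```

With these quartiles the fence is 5, so every one of the ten largest values would be flagged for review under this rule.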

Interactions

[Figures: 169 pairwise interaction plots (Matplotlib v3.9.2); images not preserved in this text export]

Correlations

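Variable | BDRM_COND | BED_RMS | CITY | FIREPLACES | FULL_BTH | GROSS_AREA | HLF_BTH | KITCHENS | LIVING_AREA | NUM_BLDGS | NUM_PARKING | RES_FLOOR | STRUCTURE_CLASS | TT_RMS | YR_BUILT | YR_REMODEL | ZIP_CODE
BDRM_COND | 1.000 | 0.026 | 0.148 | 0.051 | 0.064 | 1.000 | 0.069 | 0.000 | 1.000 | 0.000 | 1.000 | 0.022 | 0.364 | 0.127 | 0.000 | 0.000 | 0.127
BED_RMS | 0.026 | 1.000 | 0.206 | 0.034 | 0.659 | 0.898 | 0.171 | 0.696 | 0.873 | 0.003 | 0.449 | 0.742 | 0.138 | 0.940 | -0.147 | 0.277 | 0.123
CITY | 0.148 | 0.206 | 1.000 | 0.045 | 0.078 | 0.010 | 0.074 | 0.183 | 0.010 | 0.006 | 0.026 | 0.021 | 0.259 | 0.220 | 0.000 | 0.021 | 0.694
FIREPLACES | 0.051 | 0.034 | 0.045 | 1.000 | 0.044 | 0.103 | 0.253 | -0.181 | 0.103 | 0.002 | 0.125 | 0.057 | 0.099 | 0.054 | -0.062 | 0.104 | -0.011
FULL_BTH | 0.064 | 0.659 | 0.078 | 0.044 | 1.000 | 0.311 | 0.193 | 0.859 | 0.343 | 0.017 | 0.233 | 0.433 | 0.065 | 0.664 | -0.060 | 0.233 | 0.052
GROSS_AREA | 1.000 | 0.898 | 0.010 | 0.103 | 0.311 | 1.000 | 0.141 | 0.245 | 0.970 | 0.000 | 0.519 | 0.775 | 0.102 | 0.925 | -0.104 | 0.243 | 0.070
HLF_BTH | 0.069 | 0.171 | 0.074 | 0.253 | 0.193 | 0.141 | 1.000 | 0.136 | 0.143 | 0.031 | 0.220 | 0.220 | 0.103 | 0.178 | 0.091 | 0.190 | 0.068
KITCHENS | 0.000 | 0.696 | 0.183 | -0.181 | 0.859 | 0.245 | 0.136 | 1.000 | 0.222 | 0.007 | 0.223 | 0.417 | 0.225 | 0.721 | -0.165 | 0.066 | 0.097
LIVING_AREA | 1.000 | 0.873 | 0.010 | 0.103 | 0.343 | 0.970 | 0.143 | 0.222 | 1.000 | 0.000 | 0.449 | 0.790 | 0.096 | 0.900 | -0.104 | 0.248 | 0.004
NUM_BLDGS | 0.000 | 0.003 | 0.006 | 0.002 | 0.017 | 0.000 | 0.031 | 0.007 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.003
NUM_PARKING | 1.000 | 0.449 | 0.026 | 0.125 | 0.233 | 0.519 | 0.220 | 0.223 | 0.449 | 0.000 | 1.000 | 0.311 | 0.039 | 0.457 | 0.142 | 0.166 | 0.286
RES_FLOOR | 0.022 | 0.742 | 0.021 | 0.057 | 0.433 | 0.775 | 0.220 | 0.417 | 0.790 | 0.000 | 0.311 | 1.000 | 0.122 | 0.767 | -0.161 | 0.197 | -0.024
STRUCTURE_CLASS | 0.364 | 0.138 | 0.259 | 0.099 | 0.065 | 0.102 | 0.103 | 0.225 | 0.096 | 0.000 | 0.039 | 0.122 | 1.000 | 0.188 | 1.000 | 0.062 | 0.229
TT_RMS | 0.127 | 0.940 | 0.220 | 0.054 | 0.664 | 0.925 | 0.178 | 0.721 | 0.900 | 0.003 | 0.457 | 0.767 | 0.188 | 1.000 | -0.176 | 0.282 | 0.125
YR_BUILT | 0.000 | -0.147 | 0.000 | -0.062 | -0.060 | -0.104 | 0.091 | -0.165 | -0.104 | 0.000 | 0.142 | -0.161 | 1.000 | -0.176 | 1.000 | 0.055 | 0.160
YR_REMODEL | 0.000 | 0.277 | 0.021 | 0.104 | 0.233 | 0.243 | 0.190 | 0.066 | 0.248 | 0.000 | 0.166 | 0.197 | 0.062 | 0.282 | 0.055 | 1.000 | 0.007
ZIP_CODE | 0.127 | 0.123 | 0.694 | -0.011 | 0.052 | 0.070 | 0.068 | 0.097 | 0.004 | 0.003 | 0.286 | -0.024 | 0.229 | 0.125 | 0.160 | 0.007 | 1.000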
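
This section does not state which coefficient each cell of the matrix holds, and the matrix mixes numeric and categorical variables. As a hedged illustration for the numeric pairs only, the sketch below computes a Spearman rank correlation with pandas. The data frame holds made-up values chosen for the example, not rows from the dataset, and Spearman is shown as one plausible measure rather than the report's confirmed method.

```python
import pandas as pd

# Hypothetical values for illustration; not taken from the dataset.
rooms = pd.DataFrame({
    "BED_RMS": [1, 2, 3, 4, 6],
    "TT_RMS": [3, 4, 6, 9, 20],
    "YR_BUILT": [1990, 1950, 1920, 1910, 1900],
})

# Spearman correlates the ranks, so any strictly monotonic relation gives +/-1.
spearman = rooms.corr(method="spearman")
print(spearman.round(3))
```

In this toy frame BED_RMS and TT_RMS rise together (coefficient 1), while YR_BUILT falls as BED_RMS rises (coefficient -1).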

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
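
A minimal sketch of the idea behind nullity correlation, assuming it is computed as the ordinary correlation between the missing/present indicators of each pair of columns. The small data frame is invented for the example, and the plotting library may apply extra filtering (such as dropping columns with no missing values) that is not reproduced here.

```python
import pandas as pd

# Hypothetical rows: BED_RMS and TT_RMS are missing together, while
# STRUCTURE_CLASS is present exactly where the other two are missing.
df = pd.DataFrame({
    "BED_RMS": [3.0, None, 2.0, None, 4.0],
    "TT_RMS": [7.0, None, 5.0, None, 9.0],
    "STRUCTURE_CLASS": [None, "C - Brick/Concr", None, "D - Wood/Frame", None],
})

# 1 where a value is missing, 0 where it is present.
nullity = df.isna().astype(int)

# Pearson correlation between the indicator columns.
nullity_corr = nullity.corr()
print(nullity_corr.round(2))
```

A value near 1 means two columns tend to be missing in the same rows; a value near -1 means one tends to be present where the other is missing.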

Sample

(index) | CITY | ZIP_CODE | NUM_BLDGS | LU_DESC | BLDG_TYPE | RES_FLOOR | LAND_SF | GROSS_AREA | LIVING_AREA | LAND_VALUE | BLDG_VALUE | TOTAL_VALUE | GROSS_TAX | YR_BUILT | YR_REMODEL | STRUCTURE_CLASS | BED_RMS | FULL_BTH | HLF_BTH | KITCHENS | TT_RMS | BDRM_COND | FIREPLACES | NUM_PARKING
0 | EAST BOSTON | 2128.0 | 1 | THREE-FAM DWELLING | RE - Row End | 3.0 | 1,150 | 3353.0 | 2202.0 | 197,600 | 594,400 | 792,000 | $8,632.80 | 1900.0 | NaN | NaN | 6.0 | 3.0 | 0.0 | 3.0 | 12.0 | NaN | 0.0 | 3.0
1 | EAST BOSTON | 2128.0 | 1 | THREE-FAM DWELLING | RM - Row Middle | 3.0 | 1,150 | 3047.0 | 2307.0 | 198,500 | 619,700 | 818,200 | $8,918.38 | 1920.0 | 2000.0 | NaN | 3.0 | 3.0 | 0.0 | 3.0 | 9.0 | NaN | 0.0 | 0.0
2 | EAST BOSTON | 2128.0 | 1 | THREE-FAM DWELLING | RM - Row Middle | 3.0 | 1,150 | 3392.0 | 2268.0 | 199,100 | 605,300 | 804,400 | $8,767.96 | 1905.0 | 1985.0 | NaN | 5.0 | 3.0 | 0.0 | 3.0 | 13.0 | NaN | 0.0 | 0.0
3 | EAST BOSTON | 2128.0 | 1 | THREE-FAM DWELLING | RM - Row Middle | 3.0 | 1,150 | 3108.0 | 2028.0 | 199,700 | 535,600 | 735,300 | $8,014.77 | 1900.0 | 1991.0 | NaN | 5.0 | 3.0 | 0.0 | 3.0 | 11.0 | NaN | 0.0 | 0.0
4 | EAST BOSTON | 2128.0 | 1 | TWO-FAM DWELLING | RE - Row End | 3.0 | 2,010 | 3700.0 | 2546.0 | 230,200 | 501,400 | 731,600 | $7,974.44 | 1900.0 | 1978.0 | NaN | 6.0 | 3.0 | 0.0 | 2.0 | 13.0 | NaN | 0.0 | 0.0
5 | EAST BOSTON | 2128.0 | 1 | THREE-FAM DWELLING | DK - Decker | 3.0 | 2,500 | 6278.0 | 4362.0 | 263,800 | 1,037,400 | 1,301,200 | $14,183.08 | 1900.0 | 2018.0 | NaN | 13.0 | 6.0 | 0.0 | 3.0 | 20.0 | NaN | 0.0 | 0.0
6 | EAST BOSTON | 2128.0 | 1 | THREE-FAM DWELLING | DK - Decker | 3.0 | 2,500 | 6432.0 | 4296.0 | 264,700 | 1,003,200 | 1,267,900 | $13,820.11 | 1900.0 | 2009.0 | NaN | 14.0 | 5.0 | 0.0 | 3.0 | 20.0 | NaN | 0.0 | 0.0
7 | EAST BOSTON | 2128.0 | 1 | THREE-FAM DWELLING | DK - Decker | 3.0 | 2,500 | 6048.0 | 4080.0 | 265,300 | 885,400 | 1,150,700 | $12,542.63 | 1900.0 | NaN | NaN | 11.0 | 3.0 | 0.0 | 3.0 | 16.0 | NaN | 0.0 | 0.0
8 | EAST BOSTON | 2128.0 | 1 | THREE-FAM DWELLING | DK - Decker | 3.0 | 2,500 | 4339.0 | 2937.0 | 265,900 | 619,300 | 885,200 | $9,648.68 | 1900.0 | 1998.0 | NaN | 5.0 | 3.0 | 0.0 | 3.0 | 14.0 | NaN | 0.0 | 0.0
9 | EAST BOSTON | 2128.0 | 1 | THREE-FAM DWELLING | DK - Decker | 3.0 | 2,500 | 4659.0 | 3241.0 | 226,700 | 811,000 | 1,037,700 | $11,310.93 | 1900.0 | 2020.0 | NaN | 6.0 | 3.0 | 0.0 | 3.0 | 14.0 | NaN | 0.0 | 0.0

(index) | CITY | ZIP_CODE | NUM_BLDGS | LU_DESC | BLDG_TYPE | RES_FLOOR | LAND_SF | GROSS_AREA | LIVING_AREA | LAND_VALUE | BLDG_VALUE | TOTAL_VALUE | GROSS_TAX | YR_BUILT | YR_REMODEL | STRUCTURE_CLASS | BED_RMS | FULL_BTH | HLF_BTH | KITCHENS | TT_RMS | BDRM_COND | FIREPLACES | NUM_PARKING
182232 | BRIGHTON | 2135.0 | 1 | SINGLE FAM DWELLING | SD - Semi-Det | 2.0 | 3,778 | 4240.0 | 2390.4 | 289,500 | 633,400 | 922,900 | $10,059.61 | 1920.0 | NaN | NaN | 7.0 | 2.0 | 1.0 | 1.0 | 10.0 | NaN | 2.0 | 3.0
182233 | BRIGHTON | 2135.0 | 1 | TWO-FAM DWELLING | CV - Conventional | 2.5 | 5,333 | 4609.0 | 2951.6 | 365,300 | 750,000 | 1,115,300 | $12,156.77 | 1920.0 | NaN | NaN | 5.0 | 2.0 | 0.0 | 2.0 | 11.0 | NaN | 0.0 | 3.0
182234 | BRIGHTON | 2135.0 | 1 | CONDO MAIN | FS - Free Standing | 2.0 | 4,485 | NaN | NaN | 0 | 0 | 0 | $- | 1999.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN
182235 | BRIGHTON | 2135.0 | 1 | RESIDENTIAL CONDO | FS - Free Standing | 2.0 | 2,777 | 2777.0 | 1410.0 | 0 | 545,100 | 545,100 | $5,941.59 | 1920.0 | NaN | NaN | 3.0 | 1.0 | 0.0 | 1.0 | 8.0 | A - Average | 1.0 | 1.0
182236 | BRIGHTON | 2135.0 | 1 | RESIDENTIAL CONDO | FS - Free Standing | 1.0 | 1,401 | 1401.0 | 1401.0 | 0 | 494,800 | 494,800 | $5,393.32 | 1920.0 | NaN | NaN | 2.0 | 1.0 | 0.0 | 1.0 | 7.0 | A - Average | 1.0 | 1.0
182237 | BRIGHTON | 2135.0 | 1 | CITY OF BOSTON | 99 - Vacant | NaN | 5,931 | NaN | NaN | 240,500 | 0 | 240,500 | $- | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN
182238 | BRIGHTON | 2135.0 | 1 | RES LAND (Unusable) | 99 - Vacant | NaN | 4,588 | NaN | NaN | 72,800 | 0 | 72,800 | $793.52 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN
182239 | BRIGHTON | 2135.0 | 1 | THREE-FAM DWELLING | CV - Conventional | 2.5 | 7,380 | 4291.0 | 2834.4 | 464,400 | 850,500 | 1,314,900 | $14,332.41 | 1920.0 | 1990.0 | NaN | 6.0 | 3.0 | 0.0 | 3.0 | 12.0 | NaN | 0.0 | 2.0
182240 | BRIGHTON | 2135.0 | 1 | STRIP CTR STORES | 319 - STRIP RETAIL/ OFFICE | NaN | 12,500 | 14520.0 | 7260.0 | 990,900 | 1,458,800 | 2,459,200 | $62,143.98 | 1947.0 | 2016.0 | C - Brick/Concr | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN
182241 | BRIGHTON | 2135.0 | 1 | OTHER EXEMPT BLDG | 973 - ADMINISTRATIVE BLDG | NaN | 34,125 | 7386.0 | 7386.0 | 2,138,600 | 1,342,900 | 3,489,000 | $- | 1900.0 | NaN | C - Brick/Concr | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN

Duplicate rows

Most frequently occurring

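(index) | CITY | ZIP_CODE | NUM_BLDGS | LU_DESC | BLDG_TYPE | RES_FLOOR | LAND_SF | GROSS_AREA | LIVING_AREA | LAND_VALUE | BLDG_VALUE | TOTAL_VALUE | GROSS_TAX | YR_BUILT | YR_REMODEL | STRUCTURE_CLASS | BED_RMS | FULL_BTH | HLF_BTH | KITCHENS | TT_RMS | BDRM_COND | FIREPLACES | NUM_PARKING | # duplicates
245 | BOSTON | 2111.0 | 1 | CONDO PARKING (RES) | NoBld | NaN | NaN | NaN | NaN | 0 | 74,600 | 74,600 | $813.14 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 305
966 | BOSTON | 2118.0 | 1 | CONDO PARKING (RES) | NoBld | NaN | NaN | NaN | NaN | 0 | 60,000 | 60,000 | $654.00 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 281
330 | BOSTON | 2114.0 | 1 | CONDO PARKING (RES) | NoBld | NaN | NaN | NaN | NaN | 0 | 38,000 | 38,000 | $414.20 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 274
2762 | SOUTH BOSTON | 2127.0 | 1 | CONDO PARKING (RES) | NoBld | NaN | NaN | NaN | NaN | 0 | 40,000 | 40,000 | $436.00 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 269
331 | BOSTON | 2114.0 | 1 | CONDO PARKING (RES) | NoBld | NaN | NaN | NaN | NaN | 0 | 47,000 | 47,000 | $512.30 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 261
510 | BOSTON | 2115.0 | 1 | CONDO PARKING (RES) | HR - High Rise | NaN | NaN | NaN | NaN | 0 | 60,000 | 60,000 | $654.00 | 2015.0 | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 240
964 | BOSTON | 2118.0 | 1 | CONDO PARKING (RES) | NoBld | NaN | NaN | NaN | NaN | 0 | 48,000 | 48,000 | $523.20 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 231
2776 | SOUTH BOSTON | 2127.0 | 1 | CONDO PARKING (RES) | NaN | NaN | NaN | NaN | NaN | 0 | 43,000 | 43,000 | $468.70 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 215
958 | BOSTON | 2118.0 | 1 | CONDO PARKING (RES) | NoBld | NaN | NaN | NaN | NaN | 0 | 40,000 | 40,000 | $436.00 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 175
329 | BOSTON | 2114.0 | 1 | CONDO PARKING (RES) | NoBld | NaN | NaN | NaN | NaN | 0 | 26,600 | 26,600 | $289.94 | NaN | NaN | NaN | NaN | 0.0 | 0.0 | 0.0 | NaN | NaN | NaN | NaN | 147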
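
A table like the one above can be approximated with pandas by keeping the rows that have an exact copy elsewhere and counting how often each distinct row occurs. This is a sketch under assumptions: the four-column frame is invented for the example, and whether the report's "# duplicates" column counts all occurrences or only the extra copies is not stated in this section.

```python
import pandas as pd

# Hypothetical miniature of the dataset; values are for illustration only.
df = pd.DataFrame({
    "CITY": ["BOSTON", "BOSTON", "BOSTON", "BRIGHTON"],
    "LU_DESC": ["CONDO PARKING (RES)"] * 3 + ["TWO-FAM DWELLING"],
    "BLDG_VALUE": [74600, 74600, 74600, 750000],
    "YR_BUILT": [None, None, None, 1920.0],
})

# keep=False marks every member of a duplicate group, not just the repeats.
dupes = df[df.duplicated(keep=False)]

# dropna=False keeps groups whose key columns contain NaN (here YR_BUILT).
counts = (
    dupes.groupby(list(df.columns), dropna=False)
    .size()
    .reset_index(name="# duplicates")
    .sort_values("# duplicates", ascending=False)
)
print(counts)
```

Without `dropna=False`, rows with missing values in any column would silently drop out of the grouping, which matters here because the most frequent duplicates are parking records that are mostly empty.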